Genome cluster database. A sequence family analysis platform for Arabidopsis and rice.
Abstract
The genome-wide protein sequences from Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa ssp. japonica) were clustered into families using two approaches: sequence similarity-based clustering and domain-based clustering. These two fundamentally different methods produced separate cluster sets whose complementary properties compensate for each other's limitations in accurate family analysis. Functional names were assigned to the identified families with an efficient computational approach that uses the description of the most common molecular function gene ontology node within each cluster. Subsequently, multiple alignments and phylogenetic trees were calculated for the assembled families. All clustering results and their underlying sequences were organized in the Web-accessible Genome Cluster Database (http://bioinfo.ucr.edu/projects/GCD), which offers interactive, user-friendly sequence family mining tools that help the plant science community analyze any family of interest. An automated clustering pipeline keeps the database current as the annotations of the two genomes are updated and the clustering methods are improved. The analysis allowed the first systematic identification of family and singlet proteins present in both organisms as well as those restricted to one of them. In addition, the established Web resources for mining these data provide a road map for future studies comparing the composition and structure of protein families between the two species.
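The naming step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `name_cluster`, the input layout (a mapping from protein ID to a list of molecular function descriptions), the alphabetical tie-break, and the fallback label are all assumptions made for illustration, and the protein IDs are hypothetical.

```python
from collections import Counter


def name_cluster(member_go_terms, default="unknown function"):
    """Pick a family name: the most common molecular function
    description among the members of one cluster.

    member_go_terms: dict mapping a protein ID to a list of GO
    molecular function descriptions (hypothetical input layout).
    """
    # Count each description at most once per protein.
    counts = Counter(
        term
        for terms in member_go_terms.values()
        for term in set(terms)
    )
    if not counts:
        # No member carries an annotation (assumed fallback label).
        return default
    best = max(counts.values())
    # Assumed tie-break: alphabetical order, so the result is deterministic.
    return min(term for term, n in counts.items() if n == best)


# Hypothetical cluster with made-up protein IDs and annotations.
cluster = {
    "protein_a": ["kinase activity", "ATP binding"],
    "protein_b": ["kinase activity"],
    "protein_c": ["ATP binding", "kinase activity"],
    "protein_d": [],
}
print(name_cluster(cluster))
```

Counting each description once per protein keeps a single heavily annotated member from dominating the vote; other weighting schemes are equally plausible.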
Similar resources
Isolation and molecular characterization of the RecQsim gene in Arabidopsis, rice (Oryza sativa) and rape (Brassica napus)
In any organism that reproduces sexually, DNA recombination plays vital roles in the generation of allelic diversity as well as in the preservation of genome fidelity. Genome fidelity is particularly important in plants because mutations occurring during the development of flowering plants are heritable and can be passed on to the next generation. One of the gene families that play crucial roles in ...
Functional analysis of rice bidirectional promoters
Genome-wide analysis of basic/helix-loop-helix transcription factor family in rice and Arabidopsis.
The basic/helix-loop-helix (bHLH) transcription factors and their homologs form a large family in plant and animal genomes. They are known to play important roles in the specification of tissue types in animals. On the other hand, few plant bHLH proteins have been studied functionally. Recent completion of whole genome sequences of model plants Arabidopsis (Arabidopsis thaliana) and rice (Oryza...
Comparative plant genomics resources at PlantGDB.
PlantGDB (http://www.plantgdb.org/) is a database of plant molecular sequences. Expressed sequence tag (EST) sequences are assembled into contigs that represent tentative unique genes. EST contigs are functionally annotated with information derived from known protein sequences that are highly similar to the putative translation products. Tentative Gene Ontology terms are assigned to match those...
Journal title:
- Plant physiology
Volume 138, issue 1
Publication date: 2005